Moscow foodservice market

Investors are planning to open a foodservice place in Moscow.

Research objective:

prepare a study of the Moscow foodservice market, identify its notable features and present the results to help the investors choose a suitable place.

Research tasks:

Give recommendations on:

Available information:

a dataset of foodservice places in Moscow, compiled from Yandex Maps and Yandex Business data as of summer 2022. The information posted in the Yandex Business service could have been added by users or found in publicly available sources. It is purely for reference purposes.

The research process:

  1. General Information Study.
  2. Data preprocessing.
  3. Data analysis.
  4. Detailed research.
  5. General conclusion.
  6. Presentation.

General Information Study

Conclusions

The dataset contains information about 8,406 places. It is clear from the first lines that there are missing values.

There are fourteen columns in the dataset in total:

The columns name, category, address, district, hours, price, avg_bill have object type; lat, lng, rating, middle_avg_bill, middle_coffee_cup, seats have float64 type; chain has int64 type.

We can say there is enough data for analysis. We will prepare it for further work.

Data preprocessing

Checking for duplicates

There are no duplicates.

Processing of abnormal values

The minimum value of the average bill looks abnormal; the maximum may be explained by restaurants in the dataset that focus on banquets. The maximum price of a cup of cappuccino and the maximum number of seats also look abnormal.

Checking the minimum value in the column middle_avg_bill.

The owners of "Coffeemania" on Novy Arbat Street did not specify the average bill. Let's find the median average bill in the other places of this chain in the Central Administrative District of Moscow and replace the value with that median.
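A minimal sketch of this replacement on invented data. The column names follow the dataset; the helper `replace_with_chain_median` and all values are hypothetical.

```python
import pandas as pd

# Toy data: column names follow the dataset, values are invented.
df = pd.DataFrame({
    "name": ["Coffeemania"] * 4 + ["Other"],
    "district": ["Central"] * 3 + ["Northern", "Central"],
    "middle_avg_bill": [0.0, 1500.0, 1700.0, 900.0, 400.0],
})

def replace_with_chain_median(data, idx, column):
    """Replace data.loc[idx, column] with the median of the same chain in the same district."""
    row = data.loc[idx]
    peers = data[
        (data["name"] == row["name"])
        & (data["district"] == row["district"])
        & (data.index != idx)  # exclude the abnormal row itself
    ]
    data.loc[idx, column] = peers[column].median()
    return data

bad_idx = df["middle_avg_bill"].idxmin()  # row with the abnormal minimum
df = replace_with_chain_median(df, bad_idx, "middle_avg_bill")
```

The abnormal row is excluded from its own peer group so that it does not pull the median down.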

We look at the 95th and 99th percentiles of the cost of a cup of cappuccino.

Coffee shops with the most expensive coffee.

The value for "Shokoladnitsa" is incorrect: the maximum price of a cup of coffee is most likely a typo. Let's find the median price of a cup in the other places of this chain in the Eastern Administrative District of Moscow and replace the value with that median.

Let's look at the 95th and 99th percentiles of the number of seats and at the places where there are more seats than the 99th percentile.
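A minimal sketch of the percentile check, assuming a `seats` column as in the dataset; the values are invented.

```python
import pandas as pd

# Invented seat counts with one suspiciously large value.
seats = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 1200], name="seats")

# 95th and 99th percentiles (pandas interpolates linearly by default).
p95, p99 = seats.quantile([0.95, 0.99])

# Places with more seats than the 99th percentile.
above_p99 = seats[seats > p99]
```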

Several groups of places with a very high number of seats are located at the same address, so these values really are anomalies. We will leave them in the dataset: for further work with seats, we will use median values.

The dataset also contains "Yandex Lavka", which is a food delivery service. Let's see how many such names are in the data.

We delete them from the dataset.
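A minimal sketch of counting and removing such rows. The exact spelling of the name in the real data is an assumption, so the match here is case-insensitive on a literal substring.

```python
import pandas as pd

# Invented names; the spelling "Yandex Lavka" is assumed for illustration.
df = pd.DataFrame({"name": ["Yandex Lavka", "Shokoladnitsa", "yandex lavka", "Dodo Pizza"]})

# Case-insensitive literal match on the service name.
is_delivery = df["name"].str.lower().str.contains("yandex lavka", regex=False)
n_delivery = int(is_delivery.sum())

# Keep everything that is not the delivery service.
df = df[~is_delivery].reset_index(drop=True)
```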

Processing of missing values

Heat map of missing values in percentage ratio.
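The numbers behind such a heat map can be computed as follows; this is a sketch on invented data, and the plotting step itself is omitted.

```python
import numpy as np
import pandas as pd

# Invented rows; column names follow the dataset.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "price": [None, "high", None, None],
    "rating": [4.5, 4.0, 4.2, np.nan],
})

# Share of missing values per column, in percent, highest first.
missing_pct = df.isna().mean().mul(100).round(1).sort_values(ascending=False)
```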

More than 50% of the values are missing in the price, avg_bill and middle_avg_bill columns: the dataset is based on data from Yandex Maps and Yandex Business, which means that the owners or visitors of these places did not add information about prices and the average bill.

Values are also missing in the middle_avg_bill column because data from avg_bill is not copied into it when it is specified for coffee shops or bars/pubs.

The very high percentage of missing values (94%) in the middle_coffee_cup column is explained by the fact that this column is filled from the avg_bill column only if the value starts with the substring "Price of one cup of cappuccino", i.e. mainly for coffee shops, while the dataset also contains other categories of places.
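A sketch of how such a column could be derived and why it ends up mostly empty. The exact string format of avg_bill and the use of the midpoint of a price range are assumptions made for illustration; the helper `midpoint` is hypothetical.

```python
import re
import pandas as pd

CUP_PREFIX = "Price of one cup of cappuccino"  # substring named in the text

def midpoint(text):
    """Return the mean of the integers found in text (assumed to be a price or a price range)."""
    numbers = [float(n) for n in re.findall(r"\d+", text)]
    return sum(numbers) / len(numbers) if numbers else None

# Invented avg_bill values; the string formats are assumptions.
avg_bill = pd.Series([
    "Price of one cup of cappuccino: 150-200 rubles",
    "Average bill: 1000-1500 rubles",
    None,
])

# Only rows that start with the cappuccino prefix feed the column.
is_cup = avg_bill.str.startswith(CUP_PREFIX, na=False)
middle_coffee_cup = avg_bill[is_cup].map(midpoint).reindex(avg_bill.index)
```

Every row whose avg_bill is missing or describes something other than a cappuccino stays empty in middle_coffee_cup.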

Fill in the missing values as follows:

Fill in the missing values in the price column.

Fill in the missing values in the column middle_avg_bill.

Fill in the missing values in the column middle_coffee_cup.

Checking what percentage of missing values was processed.

The percentage of missing values in the price column decreased from 61% to 54%, in middle_avg_bill from 63% to 50%, and in middle_coffee_cup from 94% to 91%.

Adding columns

We add two columns to the dataset: street, with street names taken from the address column, and is_24/7, with the logical value True if the place is open daily and around the clock and False if not.
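A minimal sketch of adding both columns. The address layout ("city, street, building") and the wording of the hours value are assumptions about the data format, and the rows are invented.

```python
import pandas as pd

# Invented rows; the string formats are assumed for illustration.
df = pd.DataFrame({
    "address": ["Moscow, Mira Avenue, 10", "Moscow, Novy Arbat Street, 21"],
    "hours": ["daily, around the clock", "daily, 10:00-22:00"],
})

# Street: the second comma-separated part of the address.
df["street"] = df["address"].str.split(",").str[1].str.strip()

# True only when the place is open every day, around the clock.
df["is_24/7"] = df["hours"].eq("daily, around the clock")
```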

Earlier we found out that there are no obvious duplicates in the dataset. Let's check if there are any implicit ones: we will bring the address and name columns to the same case, replace ё with е, remove the special characters, then check for duplicates again.

There are 4 implicit duplicates in the dataset. We delete them.
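A minimal sketch of this check on invented rows. The helper `normalize` is hypothetical and assumes that name and address contain no missing values.

```python
import re
import pandas as pd

# Invented rows: the first two differ only in case, ё/е and punctuation.
df = pd.DataFrame({
    "name": ["Кафе «Ёлка»", "кафе елка", "Bar 12"],
    "address": ["Москва, ул. Ленина, 1", "москва ул ленина 1", "Москва, ул. Мира, 5"],
})

def normalize(text):
    """Lower-case, map ё to е, drop special characters, collapse whitespace."""
    text = text.lower().replace("ё", "е")
    text = re.sub(r"[^\w\s]", "", text)
    return re.sub(r"\s+", " ", text).strip()

# Compare rows by normalized name and address, keep the first occurrence.
keys = df[["name", "address"]].apply(lambda col: col.map(normalize))
is_dup = keys.duplicated()
df = df[~is_dup].reset_index(drop=True)
```

The original spelling is kept in the dataset; the normalized keys are used only for the comparison.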

Conclusions

Two abnormal values were processed: the average bill of 0.00 at "Coffeemania" on Novy Arbat Street and the mistyped price of a cup of coffee at one of the 'Shokoladnitsa' places were each replaced by the median for the same chain in the same administrative district. The 4 implicit duplicates were removed.

There is a large number of missing values in the dataset. More than 50% of the values are missing in the price, avg_bill and middle_avg_bill columns: the dataset is based on data from Yandex Maps and Yandex Business, which means that the owners or visitors of these places did not add information about prices and the average bill.

Values are also missing in the middle_avg_bill column because data from avg_bill is not copied into it when it is specified for coffee shops or bars/pubs.

The very high percentage of missing values (94%) in the middle_coffee_cup column is explained by the fact that this column is filled only for coffee shops.

Fill in the missing values as follows:

After processing, the percentage of missing values in the price column decreased from 61% to 54%, in middle_avg_bill from 63% to 50%, and in middle_coffee_cup from 94% to 91%.

Two columns were added to the dataset: street, with street names, and is_24/7, indicating that the place is open daily and around the clock.

Data analysis

Number of places by category

Let's see which categories of places and how many of them are represented in the data.

Cafes and restaurants are the most common categories in the dataset: 2,378 and 1,971 places, respectively. Bakeries are the least common, with 256 places.

Number of seats in places

Box-and-whisker diagram.

Restaurants, bars/pubs and coffee shops lead in the number of seats: the median number of seats is 90, 82 and 80, respectively. On the boxplot, we see outliers: in each category there are places whose number of seats significantly exceeds the median.
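A minimal sketch of the per-category median on invented data, showing why the median is used here: a single anomalous value barely affects it.

```python
import pandas as pd

# Invented rows; one restaurant has an anomalous number of seats.
df = pd.DataFrame({
    "category": ["restaurant", "restaurant", "restaurant", "cafe", "cafe", "cafe"],
    "seats": [80, 90, 1200, 40, 60, None],
})

# Median seats per category; missing values are skipped.
median_seats = df.groupby("category")["seats"].median().sort_values(ascending=False)

# For comparison: the mean is pulled up by the anomaly.
mean_seats = df.groupby("category")["seats"].mean()
```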

The ratio of chain and non-chain places

The dataset contains 63.2% non-chain places and 36.8% chain places.

Which categories of places are more often chains?

As we can see on the graph, there are three categories of places that are more often chains: coffee shops, pizzerias and bakeries.

We will treat as one chain the places that have the value 1 in the chain column, the same name and the same category. Popularity will be assessed by the number of places in the chain.
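This definition can be sketched as a group-and-count on invented data; the column names follow the dataset.

```python
import pandas as pd

# Invented rows.
df = pd.DataFrame({
    "name": ["Dodo Pizza"] * 3 + ["Shokoladnitsa"] * 2 + ["Local Cafe"],
    "category": ["pizzeria"] * 3 + ["coffee shop"] * 2 + ["cafe"],
    "chain": [1, 1, 1, 1, 1, 0],
})

# A chain = rows flagged chain == 1 that share both name and category;
# popularity = number of places in the chain.
top_chains = (
    df[df["chain"] == 1]
    .groupby(["name", "category"])
    .size()
    .sort_values(ascending=False)
    .head(15)
)
```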

We check how the top 15 is distributed by category.

The Domino's Pizza and Dodo Pizza chains lead among the popular chains. The top 15 chains do not include bars/pubs, fast food places or canteens.

Let's take a look at the total number of places from the top 15 and the number of places of each category by district.

As expected, most of the places from the top 15 are in the Central Administrative District of Moscow. The fewest, 33, are in the North-Western District.

Places in Moscow districts

Which administrative divisions of Moscow are present in the dataset?

In total, 9 administrative districts of Moscow are represented in the dataset. Let's look at the total number of places and the number of places of each category in these districts.

The Central Administrative District of Moscow is also the leader in the total number of places.

Distribution of average ratings by categories of places

Let's look at the boxplot.

Almost all categories of places have outliers towards low ratings.

The average rating of every category exceeds 4 points. Bars/pubs have the highest average rating, 4.39, and fast food places have the lowest, 4.05.

Average rating of places in each district

We calculate the average rating of places in each administrative district of Moscow.

We build a choropleth map with an average rating of places in each district.

Places in the Central Administrative District of Moscow have the highest average rating (4.38); the lowest is in the South-Eastern District (4.1).

Clusters of foodservice places

All places in the dataset are displayed on the map using clusters.

There are noticeably more places in the center, north and west of Moscow than in the south and east.

Top 15 streets by number of foodservice places

Mira Avenue leads in the number of places in Moscow: 184 places are located on this street.

Let's plot the distribution of the number of places and their categories by the top 15 streets.

The most numerous categories of places on the top 15 streets are cafes, coffee shops and restaurants.

Streets with one foodservice place

We will find streets where there is only one foodservice place, and see what kind of places they are.

In total, there are 458 streets in Moscow with only one foodservice place; most often it is a cafe.
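A minimal sketch of this selection on invented data, assuming the street column added during preprocessing.

```python
import pandas as pd

# Invented rows.
df = pd.DataFrame({
    "street": ["Mira Avenue", "Mira Avenue", "Lesnaya Street", "Sadovaya Street"],
    "category": ["cafe", "restaurant", "cafe", "bakery"],
})

# Number of places per street, then streets that occur exactly once.
places_per_street = df["street"].value_counts()
single_streets = places_per_street[places_per_street == 1].index

# The places on those streets and their categories.
single_places = df[df["street"].isin(single_streets)]
category_counts = single_places["category"].value_counts()
```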

How does distance from the center affect prices in foodservice places?

We calculate the median of the middle_avg_bill column (the average bill) for each district and use this value as the price indicator of the district.

We build a choropleth map with the values obtained for each district.

The median average bill in the Central Administrative District (950 rubles) is significantly higher than in other districts. The cheapest places are located in the South-Eastern Administrative District (the median bill is 425 rubles).

Opening hours

We study the opening hours of places and their dependence on the location and category of the places.

As expected, the highest percentage of round-the-clock places is in the "fast food" category; canteens have the lowest percentage.

Let's see in which districts of Moscow there are more round-the-clock places.

The Central Administrative District has the largest number of round-the-clock places. The choropleth map also shows that there are more round-the-clock places in the east of Moscow than in the west.

Characteristics of places with poor ratings

For our study, we will consider a rating below 4 points to be bad. Let's see what percentage of places have a low rating.

We look at how low-rated places are distributed by category.

Fast food places (27%) and cafes (23%) have the highest share of places with a rating below 4 points.
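A minimal sketch of two ways to compute such shares on invented data: the distribution of low-rated places across categories, and the share of low-rated places within each category. Which of the two the figures above refer to is not specified here.

```python
import pandas as pd

# Invented rows; the threshold of 4 points follows the text.
df = pd.DataFrame({
    "category": ["fast food"] * 4 + ["cafe"] * 4,
    "rating": [3.5, 4.2, 4.4, 3.9, 4.5, 3.8, 4.1, 4.3],
})
is_low = df["rating"] < 4

# Overall percentage of low-rated places.
overall_share = is_low.mean() * 100

# (a) How the low-rated places are distributed across categories, in percent.
distribution = df.loc[is_low, "category"].value_counts(normalize=True).mul(100)

# (b) Percentage of low-rated places within each category.
within_category = is_low.groupby(df["category"]).mean().mul(100)
```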

Conclusions

The following can be said about foodservice places in Moscow:

We recommend paying attention to such categories of places as coffee shops and pizzerias: there are fewer of these places than cafes and restaurants, so the competition is lower. On the other hand, they are not as limited in format as bars and pubs, which are not suitable, for example, for family holidays.

The Central Administrative District of Moscow is the most saturated with foodservice places: 2,237 establishments. In the city center, the audience is much larger than the number of residents alone, thanks to offices, cultural institutions, etc. To open a new place, we recommend considering the Northern, North-Eastern and Southern districts: there is less competition than in the Central Administrative District, but there are still enough places in these districts to indicate an audience.

As for the average bill, we recommend focusing on the median average bill of places depending on the district. In the Northern, North-Eastern and Southern districts, the median bill is 650 rubles, 500 rubles and 500 rubles, respectively.

Detailed research: opening a coffee shop

Investors are planning to open a coffee shop in Moscow. Let's try to give a recommendation for opening a new place.

Number and location of coffee shops

Let's see how many coffee shops there are in the dataset, in which districts of Moscow there are most of them, what are the features of their location.

Clusters of coffee shops on the map of Moscow.

As we know from the distribution of places by district, the Southern Administrative District has a number of places comparable to the Northern and North-Eastern districts, but relatively fewer coffee shops. We can see this on the map as well.

Judging by the share of coffee shops among all the places in the district, the Southern Administrative District of Moscow is worth considering for opening a coffee shop.

24-hour coffee shops

Previously, we studied the opening hours of foodservice places and know that only 4.2% of all coffee shops in Moscow are open 24/7.

Let's see how many round-the-clock coffee shops there are and how they are distributed by districts.

There is only one round-the-clock coffee shop in the South-Eastern and Southern districts of Moscow. The largest number of 24-hour coffee shops is in the city center (26).

Coffee shop ratings

Previously, we studied the distribution of average ratings by categories of foodservice places and we know that the average rating of coffee shops is 4.28.

Let's see how the ratings of coffee shops are distributed by districts.

In the Southern District, the average rating of coffee shops is slightly lower than in Moscow as a whole.

The coffee shops in the Central district and the North-Western district have the highest average rating.

The cost of a cup of cappuccino

We analyze what price of a cup of cappuccino is worth focusing on when opening a coffee shop. We look at the average price of a cup of cappuccino in coffee shops in Moscow districts.

The most expensive cappuccino is in the coffee shops of the Central and South-Western administrative districts: an average of 187 rubles per cup.

Earlier, we also studied the distribution of chain and non-chain places by category and know that in Moscow the number of chain and non-chain coffee shops is approximately equal, unlike cafes, restaurants and bars, where there are significantly more non-chain establishments.

The average rating of chain and non-chain coffee shops in Moscow districts.

In each administrative district of Moscow, the rating of non-chain coffee shops is higher than that of chain ones.
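A minimal sketch of such a comparison on invented data, using a pivot table. The category label "coffee shop" and all values are assumptions made for illustration.

```python
import pandas as pd

# Invented rows; column names follow the dataset.
df = pd.DataFrame({
    "district": ["Central", "Central", "Central", "Southern", "Southern", "Southern"],
    "category": ["coffee shop"] * 5 + ["cafe"],
    "chain": [0, 1, 1, 0, 1, 0],
    "rating": [4.6, 4.2, 4.4, 4.4, 4.1, 3.0],
})

# Keep coffee shops only, then average the rating per district and chain flag.
coffee = df[df["category"] == "coffee shop"]
rating_by_chain = coffee.pivot_table(
    index="district", columns="chain", values="rating", aggfunc="mean"
).rename(columns={0: "non-chain", 1: "chain"})

# True if non-chain coffee shops are rated higher in every district.
non_chain_higher = bool((rating_by_chain["non-chain"] > rating_by_chain["chain"]).all())
```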

Conclusion

We recommend opening a non-chain 24-hour coffee shop in the Southern Administrative District of Moscow, with a cup of cappuccino priced from 165 rubles.

General conclusion

  1. We have studied a dataset with information about 8,406 foodservice places in Moscow.

  2. We prepared the dataset for work.

The missing values were processed:

We added two columns to the dataset: street, with street names, and is_24/7, indicating that the place is open daily and around the clock.

  3. We studied the data and came to the following conclusions about foodservice places in Moscow:

  4. We studied the coffee shops in Moscow and came to the following conclusions:

Based on the results of the detailed study, we recommend opening a non-chain 24-hour coffee shop in the Southern Administrative District of Moscow, with a cup of cappuccino priced from 165 rubles.